AITopics | visual classifier

DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models

Neural Information Processing Systems | Jun-21-2026, 20:52:06 GMT

Understanding and explaining the behavior of machine learning models is essential for building transparent and trustworthy AI systems. We introduce DEXTER, a data-free framework that employs diffusion models and large language models to generate global, textual explanations of visual classifiers. DEXTER operates by optimizing text prompts to synthesize class-conditional images that strongly activate a target classifier. These synthetic samples are then used to elicit detailed natural language reports that describe class-specific decision patterns and biases. Unlike prior work, DEXTER enables natural language explanation about a classifier's decision process without access to training data or ground-truth labels. We demonstrate DEXTER's flexibility across three tasks--activation maximization, slice discovery and debiasing, and bias explanation--each illustrating its ability to uncover the internal mechanisms of visual classifiers. Quantitative and qualitative evaluations, including a user study, show that DEXTER produces accurate, interpretable outputs. Experiments on ImageNet, Waterbirds, CelebA, and FairFaces confirm that DEXTER outperforms existing approaches in global model explanation and class-level bias reporting.

classifier, large language model, machine learning, (23 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (0.67)
Leisure & Entertainment > Sports (0.46)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.67)

DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models

Neural Information Processing Systems | Jun-13-2026, 20:57:53 GMT

Understanding and explaining the behavior of machine learning models is essential for building transparent and trustworthy AI systems. We introduce DEXTER, a data-free framework that employs diffusion models and large language models to generate global, textual explanations of visual classifiers. DEXTER operates by optimizing text prompts to synthesize class-conditional images that strongly activate a target classifier. These synthetic samples are then used to elicit detailed natural language reports that describe class-specific decision patterns and biases. Unlike prior work, DEXTER enables natural language explanation about a classifier's decision process without access to training data or ground-truth labels. We demonstrate DEXTER's flexibility across three tasks--activation maximization, slice discovery and debiasing, and bias explanation--each illustrating its ability to uncover the internal mechanisms of visual classifiers. Quantitative and qualitative evaluations, including a user study, show that DEXTER produces accurate, interpretable outputs. Experiments on ImageNet, Waterbirds, CelebA, and FairFaces confirm that DEXTER outperforms existing approaches in global model explanation and class-level bias reporting.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

123a18dfd821c8b440f42a00a27648d6-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 01:51:44 GMT

category, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe > Austria (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

123a18dfd821c8b440f42a00a27648d6-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 01:25:22 GMT

category, dataset, scenario, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.05)
Europe > Austria > Styria > Graz (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models

Carnemolla, Simone, Pennisi, Matteo, Samarasinghe, Sarinda, Bellitto, Giovanni, Palazzo, Simone, Giordano, Daniela, Shah, Mubarak, Spampinato, Concetto

arXiv.org Artificial Intelligence | Nov-18-2025

Understanding and explaining the behavior of machine learning models is essential for building transparent and trustworthy AI systems. We introduce DEXTER, a data-free framework that employs diffusion models and large language models to generate global, textual explanations of visual classifiers. DEXTER operates by optimizing text prompts to synthesize class-conditional images that strongly activate a target classifier. These synthetic samples are then used to elicit detailed natural language reports that describe class-specific decision patterns and biases. Unlike prior work, DEXTER enables natural language explanation about a classifier's decision process without access to training data or ground-truth labels. We demonstrate DEXTER's flexibility across three tasks--activation maximization, slice discovery and debiasing, and bias explanation--each illustrating its ability to uncover the internal mechanisms of visual classifiers. Quantitative and qualitative evaluations, including a user study, show that DEXTER produces accurate, interpretable outputs. Experiments on ImageNet, Waterbirds, CelebA, and FairFaces confirm that DEXTER outperforms existing approaches in global model explanation and class-level bias reporting. Code is available at https://github.com/perceivelab/dexter.

classifier, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2510.14741

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.87)

Industry:

Government (0.67)
Leisure & Entertainment > Sports (0.46)
Consumer Products & Services (0.46)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies

Neural Information Processing Systems | Sep-30-2025, 12:52:55 GMT

Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children.
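
As a rough illustration of the idea in this abstract, the sketch below feeds classifier probabilities into a Bayesian generalization step over a concept hierarchy. It is not the authors' algorithm: the toy hierarchy, the function name `concept_posterior`, the uniform prior, and the use of the size principle (likelihood inversely proportional to concept size) are all assumptions made for the example.

```python
from math import prod

# Hypothetical toy hierarchy: each candidate concept maps to the leaf classes it covers.
HIERARCHY = {
    "dalmatian": {"dalmatian"},
    "dog": {"dalmatian", "poodle"},
    "animal": {"dalmatian", "poodle", "tabby_cat"},
}


def concept_posterior(examples, hierarchy, prior=None):
    """Return a posterior over concepts given per-image classifier outputs.

    examples: list of dicts mapping leaf class -> classifier probability.
    prior: optional dict mapping concept -> prior weight (assumed uniform if omitted).
    """
    prior = prior or {concept: 1.0 / len(hierarchy) for concept in hierarchy}
    scores = {}
    for concept, leaves in hierarchy.items():
        # Assumed likelihood (size principle): the probability mass the classifier
        # places inside the concept, divided by the number of leaves it covers.
        likelihoods = [
            sum(probs.get(leaf, 0.0) for leaf in leaves) / len(leaves)
            for probs in examples
        ]
        scores[concept] = prior[concept] * prod(likelihoods)
    total = sum(scores.values())
    return {concept: score / total for concept, score in scores.items()}


# Three images that the (hypothetical) classifier mostly labels as dalmatians.
image_probs = {"dalmatian": 0.9, "poodle": 0.05, "tabby_cat": 0.05}
posterior = concept_posterior([image_probs] * 3, HIERARCHY)
best_concept = max(posterior, key=posterior.get)
```

Under these assumptions, repeated examples that all fall in the narrowest concept push the posterior toward that concept, because each broader concept pays a size penalty for every example.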

combining machine vision, machine vision and bayesian generalization, visual concept learning, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.65)

LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

Mirza, M. Jehanzeb, Karlinsky, Leonid, Lin, Wei, Kozinski, Mateusz, Possegger, Horst, Feris, Rogerio, Bischof, Horst

arXiv.org Artificial Intelligence | Oct-23-2023

Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification, enabling open-vocabulary recognition of a potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zero-shot classifiers still falls short of the results of dedicated (closed category set) classifiers trained with supervised fine-tuning. In this paper we show, for the first time, how to reduce this gap without any labels and without any paired VL data, using an unlabeled image collection and a set of texts auto-generated using a Large Language Model (LLM) describing the categories of interest and effectively substituting labeled visual instances of those categories. Using our label-free approach, we are able to attain significant performance improvements over the zero-shot performance of the base VL model and other contemporary methods and baselines on a wide variety of datasets, demonstrating absolute improvement of up to 11.7% (3.8% on average) in the label-free setting. Moreover, despite our approach being label-free, we observe 1.3% average gains over leading few-shot prompting baselines that do use 5-shot supervision.

classifier, dataset, lafter, (15 more...)

arXiv.org Artificial Intelligence

2305.18287

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States (0.04)
Europe > United Kingdom (0.04)
Europe > Austria > Styria > Graz (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Filtering Abstract Senses From Image Search Results

Neural Information Processing Systems | Apr-6-2023, 13:47:34 GMT

We propose an unsupervised method that, given a word, automatically selects non-abstract senses of that word from an online ontology and generates images depicting the corresponding entities. When faced with the task of learning a visual model based only on the name of an object, a common approach is to find images on the web that are associated with the object name, and then train a visual classifier from the search result. As words are generally polysemous, this approach can lead to relatively noisy models if many examples due to outlier senses are added to the model. We argue that images associated with an abstract word sense should be excluded when training a visual classifier to learn a model of a physical object. While image clustering can group together visually coherent sets of returned images, it can be difficult to distinguish whether an image cluster relates to a desired object or to an abstract sense of the word.

classifier, filtering abstract sense, image search result, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Information Management > Search (0.64)
Information Technology > Sensing and Signal Processing > Image Processing (0.53)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Image Matching (0.40)

Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies

Neural Information Processing Systems | Apr-6-2023, 12:02:24 GMT

Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms. Current methods typically fail to find the appropriate level of generalization in a concept hierarchy for a given set of visual examples. Recent work in cognitive science on Bayesian models of generalization addresses this challenge, but prior results assumed that objects were perfectly recognized. We present an algorithm for learning visual concepts directly from images, using probabilistic predictions generated by visual classifiers as the input to a Bayesian generalization model. As no existing challenge data tests this paradigm, we collect and make available a new, large-scale dataset for visual concept learning using the ImageNet hierarchy as the source of possible concepts, with human annotators to provide ground truth labels as to whether a new image is an instance of each concept using a paradigm similar to that used in experiments studying word learning in children.

combining machine vision, machine vision and bayesian generalization, visual concept learning, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.65)

Neural Variational Learning for Grounded Language Acquisition

Pillai, Nisha, Matuszek, Cynthia, Ferraro, Francis

arXiv.org Artificial Intelligence | Jul-20-2021

We propose a learning system in which language is grounded in visual percepts without specific pre-defined categories of terms. We present a unified generative method to acquire a shared semantic/visual embedding that enables the learning of language about a wide range of real-world objects. We evaluate the efficacy of this learning by predicting the semantics of objects and comparing the performance with neural and non-neural inputs. We show that this generative approach exhibits promising results in language grounding without pre-specifying visual categories under low resource settings. Our experiments demonstrate that this approach is generalizable to multilingual, highly varied datasets.

baseline, classification, classifier, (15 more...)

arXiv.org Artificial Intelligence

2107.14593

Country: